Dictionary
Search is surely the most frequenly performed action by computers. Let's consider a simple dictionary ADT that represent data structure that can do search and store information.
Dictionary ADT
trait Dictionary[Key, E] {
def clear(): Unit
def insert(k: Key, e: E): Unit
def remove(k: Key): Option[E]
def removeAny(): Option[E]
def find(k: Key): Option[E]
def size: Int
}
Binary search tree (BST) node ADT
To implement dictionary, here I use binary search tree. First, I need to define node of the tree.
trait BinNode[E] {
def element: E
def setElement(newElement: E): Unit
def left: Option[BinNode[E]]
def right: Option[BinNode[E]]
def isLeaf: Boolean
}
BST node implementation
class BSTNode[Key, E](private var k: Key, private var e: E,
private var l: Option[BSTNode[Key, E]] = None,
private var r: Option[BSTNode[Key, E]] = None) extends BinNode[E] {
def key: Key = k
def setKey(newKey: Key): Unit = k = newKey
def element: E = e
def setElement(newElement: E): Unit = e = newElement
def left: Option[BSTNode[Key, E]] = l
def setLeft(newLeft: Option[BSTNode[Key, E]]): Unit = l = newLeft
def right: Option[BSTNode[Key, E]] = r
def setRight(newRight: Option[BSTNode[Key, E]]): Unit = r = newRight
def isLeaf: Boolean = left.isEmpty && right.isEmpty
}
BST implementation
class BST[Key, E](implicit val ordering: Ordering[Key])
extends Dictionary[Key, E] {
var root: Option[BSTNode[Key, E]] = None
var nodeCount = 0
def clear(): Unit = {
root = None
nodeCount = 0
}
def insert(key: Key, element: E): Unit = {
root = insertRec(root, key, element)
nodeCount += 1
}
def remove(key: Key): Option[E] = {
val temp = findRec(root, key)
temp match {
case Some(_) =>
root = removeRec(root, key)
nodeCount -= 1
case None =>
}
temp
}
def removeAny(): Option[E] = {
if (root.isEmpty) return None
val temp = root.get.element
root = removeRec(root, root.get.key)
nodeCount -= 1
Some(temp)
}
def find(key: Key): Option[E] = findRec(root, key)
def size: Int = nodeCount
def findRec(root: Option[BSTNode[Key, E]],
key: Key): Option[E] = {
if (root.isEmpty) return None
root.get.key match {
case rootKey if ordering.compare(
rootKey, key) == 0 => Some(root.get.element)
case rootKey if ordering.compare(
rootKey, key) > 0 => findRec(root.get.left, key)
case _ => findRec(root.get.right, key)
}
}
def insertRec(root: Option[BSTNode[Key, E]],
key: Key, element: E): Option[BSTNode[Key, E]] = {
if (root.isEmpty) return Some(new BSTNode[Key, E](key, element))
root.get match {
case node if
ordering.compare(root.get.key, key) > 0 => node.setLeft(insertRec(node.left, key, element))
case node => node.setRight(insertRec(node.right, key, element))
}
root
}
def removeRec(root: Option[BSTNode[Key, E]], key: Key): Option[BSTNode[Key, E]] = {
if (root.isEmpty) return None
root.get.key match {
case rootKey if ordering.compare(
rootKey, key) > 0 => root.get.setLeft(removeRec(root.get.left, key))
case rootKey if ordering.compare(
rootKey, key) < 0 => root.get.setRight(removeRec(root.get.right, key))
case _ =>
if (root.get.left.isEmpty) return root.get.right
if (root.get.right.isEmpty) return root.get.left
val temp: BSTNode[Key, E] = getMin(root.get.right)
root.get.setElement(temp.element)
root.get.setKey(temp.key)
root.get.setRight(deleteMin(root.get.right))
}
root
}
// called after findRec find the key so that there must be the key
def getMin(root: Option[BSTNode[Key, E]]): BSTNode[Key, E] = {
root.get.left match {
case None => root.get
case Some(node) => getMin(root.get.left)
}
}
// called after findRec find the key so that there must be the key
def deleteMin(root: Option[BSTNode[Key, E]]): Option[BSTNode[Key, E]] = {
root.get.left match {
case None => root.get.right
case Some(node) => deleteMin(root.get.left)
}
}
}
Advantages and disadvantages of BST
Searching can be done with the data structures such as (un)sorted array, hashtable or BST. BST is fast compared to arrays to perform insert of find element. And only BST can handle range query that one want to find element within a certain range.
BST become slow when the tree is unbalanced and takes $\Theta(n^2)$ time to perform insert and find in the worst case. Fortunately, it can be augmented so that it always maintain the balanced property. The data structure is called AVL tree.